PERTUSSIS TOXIN GENE: CLONING AND EXPRESSION
OF PROTECTIVE ANTIGEN
This is a continuation-in-part of application
serial number 07/843,727, filed March 25, 1986.

The present invention is related to molecular
cloning of pertussis toxin genes capable of expressing an
antigen peptide having substantially reduced enzymatic
activity while being protective against pertussis. More
particularly, the present invention is related to
bacterial plasmids pPTX42 and pPTXS1/6A encoding pertussis
toxin.

State of The Art 

Pertussis toxin is one of the various toxic
components produced by virulent Bordetella pertussis, the
microorganism that causes whooping cough. A wide variety
of biological activities, such as histamine sensitization,
insulin secretion, lymphocytosis-promoting and
immunopotentiating effects, can be attributed to this toxin.
In addition to these activities, the toxin provides
protection to mice when challenged intracerebrally or by
aerosol. Pertussis toxin is, therefore, an important
constituent in the vaccine against whooping cough and is
included as a component in such vaccines.

However, while this is one of the major
protective antigens against whooping cough, it is also
associated with a variety of pathophysiological activities
and is believed to be the major cause of harmful side
effects associated with the present pertussis vaccine. In
most recipients these side effects are limited to local
reactions, but in rare cases neurological damage and death
do occur (Baraff et al., 1979, in Third International
Symposium on Pertussis, U.S. HEW publication No. NIH-79-1830).
Thus a need to produce a new generation of vaccine
against whooping cough is evident.

SUMMARY OF THE INVENTION



It is, therefore, an object of the present
invention to clone the gene(s) responsible for expression
of pertussis toxin.

It is a further object of the present invention
to isolate at least a part of the pertussis toxin genome
and determine the nucleotide sequence and genetic
organization thereof.

It is yet another object of the present
invention to characterize the toxin polypeptide encoded by
the cloned gene(s), at least in terms of the amino acid
sequence thereof.

Other objects and advantages of the present
invention will become evident upon a reading of the
detailed description of the invention presented herein.

BRIEF DESCRIPTION OF DRAWINGS 

These and other objects, features and many of 
the attendant advantages of the invention will be better 
understood upon a reading of the following detailed 
description when considered in connection with the 
accompanying drawings wherein:

Fig. 1 shows SDS-electrophoresis of the products
of HPLC separation of pertussis toxin. Lanes 1 and 12
contain 5 µg and 10 µg, respectively, of unfractionated
pertussis toxin. Lanes 2 through 11 contain 100 µl
aliquots of elution fractions 19 through 28, respectively.
The molecular weights of the subunits are indicated;

Fig. 2 shows the restriction map of the cloned 4.5
kb EcoRI/BamHI B. pertussis DNA fragment and genomic DNA
in the region of the pertussis toxin subunit gene. (a)
Restriction map of a 26 kb region of B. pertussis genomic
DNA containing pertussis toxin genes. (b) Restriction map
of the 4.5 kb EcoRI/BamHI insert from pPTX42. The arrow
indicates the start and translation direction of the
mature toxin subunit. The location of the Tn5 DNA
insertion in mutant strains BP356 and BP357 is shown. (c)
PstI fragments derived from the insert shown in panel b;

Fig. 3 shows Southern blot analysis of B.
pertussis genomic DNA with cloned DNA probes. (a) Total
genomic DNA from strain 3779 was digested with various
restriction enzymes as indicated on the figure, and
analyzed by Southern blot using nick-translated PstI
fragment B of pPTX42 (see Fig. 2c). (b) Between 24 µg and
60 µg of genomic DNA from strains 3779, Sakairi (pertussis
toxin⁻, Tn5⁻), BP347 (non-virulent, Tn5⁺), BP349 (hemolysin⁻,
Tn5⁺), BP353 (filamentous hemagglutinin⁻, Tn5⁺), BP356
and BP357 (both pertussis toxin⁻, Tn5⁺) (15) (lanes 1
through 7, respectively) were digested with PstI and
analyzed by Southern blot using nick-translated PstI
fragment B as the probe. (c) The same as panel b except
PstI fragment C was used as the probe;

Fig. 4 shows the physical map and genetic
organization of the pertussis toxin gene. (a) Restriction
map of the 4.5 kb EcoRI/BamHI fragment from pPTX42
containing the pertussis toxin gene cloned from B.
pertussis strain 3779 (12). The arrow indicates the
position of the Tn5 DNA insertion in pertussis toxin
negative Tn5-induced mutant strains BP356 and BP357 (24).
(b) Open reading frames in the forward direction. (c) Open
reading frames in the backward direction. The vertical
lines indicate termination codons. (d) Organization map
of the pertussis toxin gene. The arrows show the
translational direction and length of the protein coding
regions for the individual subunits. The hatched boxes
represent the signal peptides. The solid bars in S1
represent the regions homologous to the A subunits in
cholera and E. coli heat labile toxins; and

Fig. 5 shows the physical map of the pertussis
toxin S4 subunit gene. (a) Restriction map of the 4.5
kilobase pair (kb) EcoRI/BamHI fragment inserted into
pMC1403. (b) Detailed restriction map and sequencing
strategy of the PstI fragment B containing the S4 subunit
gene. Only the restriction sites used for subcloning
prior to sequencing are shown. Closed-circle arrows show
the sequencing using dideoxy chain termination and
open-circle arrows show the sequencing using base-specific
chemical cleavage. The arrows show the direction
and the length of the sequence determination. The heavy
black line represents the S4 coding region. (c) Open
reading frames in the three forward directions. (d) Open
reading frames in the three backward directions. The
vertical lines indicate termination codons.

DETAILED DESCRIPTION OF INVENTION

The above objects and advantages of the present
invention are achieved by molecular cloning of pertussis
toxin genes. The cloning of the gene provides means for
genetic manipulation thereof and for producing a new
generation of antigenic peptides (toxins), in substantially
pure and isolated form, for the synthesis of a new
generation of vaccine against pertussis. Of course, such
manipulation of the pertussis toxin gene and the creation
of new, manipulated toxins retaining antigenicity against
pertussis but being devoid of undesirable side effects was
not heretofore possible. The present invention is the
first to clone the pertussis toxin gene in an expression
vector, to map its nucleotide sequence and to disclose the
fingerprint of the polypeptide encoded by said gene(s).

Any vector wherein the gene can be cloned by
recombination of genetic material and which will express
the cloned gene can be used, such as bacterial (e.g.,
λgt11), yeast (e.g., pGPD-1), viral (e.g., pGS20 or pMM4) and
the like. A preferred vector is the microorganism E. coli
wherein the pertussis gene has been cloned in the plasmid
thereof.

Although any similar or equivalent methods and
materials could be used in the practice or testing of the
present invention, the preferred methods and materials are
now described. All scientific and/or technical terms used
herein have the same meaning as generally understood by
one of ordinary skill in the art to which the invention
belongs. All references cited hereunder are incorporated
herein by reference.

MATERIALS AND METHODS

Materials. Restriction enzymes were purchased
from Bethesda Research Laboratories (BRL) or International
Biotechnologies, Inc. and used under conditions
recommended by the suppliers. T4 DNA ligase, M13mp19 RF
vector, isopropylthio-β-galactoside (IPTG),
5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal), the 17-bp
universal primer, Klenow fragment (Lyphozyme®) and T4
polynucleotide kinase were purchased from BRL. Calf
intestine phosphatase was obtained from Boehringer
Mannheim, nucleotides from PL-Biochemicals and base
modifying chemicals from Kodak (dimethylsulfate, hydrazine
and piperidine) and EM Science (formic acid). Plasmid
pMC1403 and E. coli strain JM101 (supE, thi, Δ(lac-proAB),
[F', traD36, proAB, lacIq ZΔM15]) were obtained from Dr.
Francis Nano (Rocky Mountain Laboratories, Hamilton,
Montana). Elutip-d® columns came from Schleicher &
Schuell and low melting point agarose from BRL.
Radiochemicals were supplied by ICN Radiochemicals (crude
[γ-32P]ATP, 7000 Ci/mmol) and NEN Research Products
([α-32P]dGTP, 800 Ci/mmol). B. pertussis strain 3779 was
obtained from Dr. John J. Munoz, Rocky Mountain Laboratories,
Hamilton, Montana. This strain is also known as 3779
BL2S4 and is commonly available.

Purification of Pertussis Toxin Subunits:
Pertussis toxin from B. pertussis strain 3779
was prepared by the method of Munoz et al., Cell Immunol.
83:92-100, 1984. Five mg of the toxin was resuspended in
trifluoroacetic acid and fractionated by high pressure
liquid chromatography (HPLC) using a 1 x 25 cm Vydac C-4
preparative column. The sample was injected in 50%
trifluoroacetic acid and eluted at 4 ml/min over 30 min
with a linear gradient of 25% to 100% acetonitrile
solution containing 66% acetonitrile and 33% isopropyl
alcohol. All solutions contained 0.1% trifluoroacetic
acid. Elution was monitored at 220 nm and two ml
fractions collected. Aliquots of selected fractions were
dried by evaporation, resuspended in gel loading buffer
containing 2-mercaptoethanol and analyzed by sodium
dodecylsulphate polyacrylamide gel electrophoresis
(SDS-PAGE) on a 12% gel.

Protein and DNA Sequencing: The polypeptide
from HPLC fraction 21 (Fig. 1, lane 4) was sequenced using
a Beckman 890C automated protein sequenator according to
the methods described by Howard et al., Mol. Biochem.
Parasit. 12:237-246, 1984. DNA was sequenced from the
SmaI site (see Fig. 2b) by the Maxam and Gilbert technique
as described in Methods in Enzymol. 65:499-560, 1980.

Isolation of Pertussis Toxin Genes: Chromosomal
DNA was prepared from B. pertussis strain 3779 following
the procedure described by Hull et al., Infect. Immun.
33:933, 1981. The DNA was digested with both
endonucleases EcoRI and BamHI and ligated into the same
sites in the polylinker of pMC1403 as described by
Casadaban et al., J. Bacteriol. 143:971-980, 1983; Maniatis
et al., Molecular Cloning: A Laboratory Manual, 1982. The
conditions for ligation were: 60 ng of vector DNA and 40
ng of insert DNA incubated with 1.5 units of T4 DNA ligase
(BRL) and 1 mM ATP at 15 °C for 20 h. E. coli JM109 cells
were transformed with the recombinant plasmid in
accordance with the procedure of Hanahan, J. Mol. Biol.
166:557-580, 1983, and clones containing the toxin gene
were identified by colony hybridization at 37 °C using a
32P-labeled 17-base mixed oligonucleotide probe 21D3 following
the procedure of Woods, Focus 6:1-3, 1984. The probe was
synthesized on a SAM-1 DNA synthesizer (Biosearch, San
Rafael, California) and consisted of the 32 possible
oligonucleotides coding for 6 consecutive amino acids of
the pertussis toxin subunit (Table 1). The probe was
purified from a 20% urea-acrylamide gel and 5'-end labeled
using 0.2 mCi of [γ-32P]ATP (ICN, crude, 7000 Ci/mmol)
and 1 unit of T4 polynucleotide kinase (BRL) per 10 µl of
reaction mixture in 50 mM Tris-HCl (pH 7.4), 5 mM DTT, 10
mM MgCl2. The labeled oligonucleotides were purified by
binding to a DEAE-cellulose column (DE52, Whatman) in 10
mM Tris-HCl (pH 7.4), 1 mM EDTA (TE) and eluted with 1.0 M
NaCl in TE. Ten positive clones were isolated and
purified. Plasmid DNA from these clones was extracted
according to the procedure of Maniatis et al., Molecular
Cloning: A Laboratory Manual, 1982, digested with routine
restriction endonucleases (BRL), and then analyzed by 0.8%
agarose gel electrophoresis in TBE (10 mM Tris-borate pH
8.0, 1 mM EDTA). Southern blot analysis using the
32P-labeled oligonucleotide 21D3 as the probe showed that all
10 clones contained an identical insert of B. pertussis
DNA. One clone was used for further analysis by Southern
blots (Fig. 3) and for DNA sequencing.



Southern Blot Analyses: DNA, extracted as
described supra, was digested and separated by
electrophoresis using either 0.7% or 1.2% agarose gels in
40 mM Tris-acetate pH 8.3, 1 mM
EDTA for 17 h at 30 V. The DNA was then blotted onto
nitrocellulose in 20X SSPE (sodium chloride, sodium
phosphate, EDTA buffer, pH 7.4) in accordance with Maniatis
et al., supra, and baked at 80°C in a vacuum oven for 2 h.
Filters were prehybridized at 68°C for 4 h in 6X SSPE,
0.5% SDS, 5X modified Denhardt's (0.1% Ficoll 400, 0.1%
bovine serum albumin, 0.1% polyvinylpyrrolidone and 0.3X
SSPE) and 100 µg/ml denatured herring sperm DNA. The
hybridization buffer was the same as the prehybridization
buffer, except EDTA was added to a final concentration of
10 mM. PstI fragments A, B, C and D were isolated by 0.8%
low-melting point agarose gel electrophoresis, purified on
Elutip-d columns (Schleicher and Schuell) and nick
translated (BRL) using [α-32P]CTP (800 Ci/mmol, NEN
Research Products). The nick translated probes were
hybridized at a concentration of about 1 µCi/ml for 48 h
at 68°C. Filters were then washed in 2X SSPE and 0.5%
SDS at room temperature (22°-25°C) for 5 min, then in 2X
SSPE and 0.1% SDS at room temperature for 5 min, then in
2X SSPE and 0.1% SDS at room temperature for 15 min, and
finally in 0.1X SSPE and 0.5% SDS at 68°C for 2 h. The
washed filters were air dried and exposed to x-ray film
using a Lightning-Plus intensifying screen following
standard techniques.

Isolation and cloning of S4 subunit gene: As
mentioned above, purified pertussis toxin from B.
pertussis strain 3779 was fractionated by high pressure
liquid chromatography (HPLC). One fraction (F-21)
contained a polypeptide which comigrated as a major band
with subunit S4 on SDS-PAGE (Fig. 1, lane 4). Although
complete separation was not achieved, the major portion of
the other toxin subunits was recovered in other HPLC
fractions, i.e., S2 in Fr22, S1 and S5 in Fr23, and S3 in
Fr24 (Fig. 1). The amino acid sequence of the first 30
NH2-terminal residues of the protein in fraction 21 was
determined and is shown in Table 1.

Based on the protein sequence shown in Table 1,
a mixed oligonucleotide probe representing a region of six
consecutive amino acids with the least redundancy of the
genetic code was synthesized. In this mixture of
oligonucleotides, identified as probe 21D3, approximately
1 out of 32 molecules corresponds to the actual DNA
sequence of the pertussis toxin gene (Table 1). This
mixed oligonucleotide probe was used to screen a DNA clone
bank containing restriction fragments of total pertussis
chromosomal DNA. The clone bank was prepared by digesting
genomic DNA isolated from B. pertussis strain 3779 with
both EcoRI and BamHI restriction endonucleases. The
complete population of restriction fragments was ligated
into the EcoRI/BamHI restriction site of expression vector
pMC1403 and the recombinant plasmid used to transform E.
coli JM109 cells following standard procedures well known
in the art. It is noted that although E. coli is the
preferred organism, other cloning vectors well known in
the art could, of course, be alternatively used.
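
The 1-in-32 figure follows from multiplying the number of
codon choices at each of the six amino acid positions. The
arithmetic can be sketched in Python as below. This is only
an illustration under stated assumptions: it uses the
standard genetic code, the function names are hypothetical,
and the peptide DEKQNW is an invented example chosen so that
the numbers work out; it is not the Table 1 sequence. The
final wobble base is dropped, which is one way six codons
can yield a 17-base probe.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def probe_degeneracy(peptide, drop_last_base=True):
    """Count the distinct oligonucleotides that can encode `peptide`."""
    total = 1
    for i, aa in enumerate(peptide):
        choices = {c for c, a in CODON_TABLE.items() if a == aa}
        if drop_last_base and i == len(peptide) - 1:
            # Omit the last wobble base: only the first two bases count.
            choices = {c[:2] for c in choices}
        total *= len(choices)
    return total


def probe_length(peptide, drop_last_base=True):
    """Length in bases of the mixed probe covering `peptide`."""
    return 3 * len(peptide) - (1 if drop_last_base else 0)


# Hypothetical peptide (assumption, not from Table 1): five two-codon
# residues followed by Trp give 2**5 = 32 oligonucleotides of 17 bases.
EXAMPLE_PEPTIDE = "DEKQNW"
```

Choosing a stretch rich in residues with one or two codons,
as the text describes, keeps the product small, so a larger
share of the labeled mixture matches the gene exactly.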

Approximately 20,000 colonies were screened by
colony hybridization using the 32P-end labeled
oligonucleotide probe 21D3. The plasmid DNA of 10
positive colonies was examined by restriction enzyme and
Southern blot analyses. All 10 colonies contained a
recombinant plasmid with an identical 4.5 kb EcoRI/BamHI
pertussis DNA insert. One of these clones, identified as
pPTX42, was selected for further characterization. A
restriction map of the insert DNA was prepared and is
shown in Figure 2b; Southern blot analysis indicated that
the oligonucleotide probe 21D3 hybridized to only the 0.8
kb SmaI/PstI fragment.

A deposit of said pPTX42 clone has been made at the
American Type Culture Collection, Rockville, MD under the
accession No. 67046. This culture will continue to be
maintained for at least 30 years after a patent issues and
will be available to the public without restriction, of
course, in accordance with the provisions of the law.

Sequencing of the NH2-terminal region for S4:
The 0.8 kb fragment was isolated by agarose gel
electrophoresis and sequenced using the Maxam and Gilbert
technique, supra. The DNA sequence was translated into an
amino acid sequence and a portion of that sequence is
compared in Table 1 to the NH2-terminal 30 amino acids of
the pertussis toxin subunit and the oligonucleotide probe
21D3 sequence. Out of the sequence of 30 amino acid
residues determined using the automated sequenator, only 2
do not correspond to the amino acid sequence deduced from
the DNA sequence, i.e., residues 24 and 26; these
assignments are questionable because they repeat the amino
acid in front of them and they are located near the end of
the analyzed sequence. Amino acid 15 could not be
determined. The rest of the deduced amino acid sequence
perfectly matches the original protein sequence. The
oligonucleotide probe sequence also perfectly matches the
cloned DNA sequence. These results indicate that at least
one of the pertussis toxin subunit genes has been cloned.

Examination of the DNA sequence indicates that a
precursor protein, perhaps containing a leader sequence,
may exist (Table 1). In fact, the NH2-terminal aspartic
acid of the mature protein is not immediately preceded by
one of the known initiation codons, i.e., ATG, GTG, TTG,
or ATT, but by GCC coding for alanine, an amino acid that
often occurs at the cleavage site of a signal peptide. A
proline is found at amino acid position -4, which is also
consistent with cleavage sites in other known sequences
where this amino acid is usually present within six
residues of the cleavage site. Possible translation
initiation sites in the same reading frame as the mature
protein and upstream of the NH2-terminal aspartic acid
are: ATG at position -9, TTG at -15 and GTG at -21;
however, none of these is preceded by a Shine/Dalgarno
ribosomal binding site (Nature, London, 254:34-38, 1975)
and only GTG at -21 is immediately followed by a basic
amino acid (arginine), as is typical of bacterial signal
sequences. Using the DNA sequence data and primer
extension to sequence the mRNA, the actual initiation site
could also be determined.


Physical mapping of the S4 gene on the bacterial
chromosome: The 1.3 kb PstI fragment B containing at
least part of the pertussis toxin gene was used as a probe
to physically map the location of this gene on the B.
pertussis genome (Fig. 2). Figure 3a shows a Southern
blot analysis of total B. pertussis DNA digested with a
variety of six base pair-specific restriction enzymes and
probed with the 1.3 kb PstI fragment B isolated from
pPTX42. Each restriction digest yielded only one DNA band
which hybridized with the probe. Since the 1.3 kb PstI
fragment B contains a SmaI site, two bands would be
expected from a SmaI digest of genomic DNA unless the SmaI
fragments were similar in size. Further analysis
indicated that the single band seen in the SmaI digest is
actually a doublet of two similar size DNA fragments. In
this particular gel, fragments of 1.3 kb and smaller
migrated off the gel during electrophoresis and thus could
not be detected; however, in other Southern blots in which
no fragment was run off the gel, only one band was found
for each restriction enzyme. These results indicate that
the gene encoded by the PstI fragment B occurs only once
in the genome. Using the data from these experiments and
similar studies using the 1.5 kb PstI fragment A and the
0.7 kb PstI/BamHI fragment D from the cloned 4.5 kb
EcoRI/BamHI fragment, a partial restriction map of a 26 kb
region of the pertussis genome as shown in Figure 2a was
obtained. This method made it possible to locate the first
restriction site of a particular endonuclease on either
side of the 4.5 kb EcoRI/BamHI fragment. This information
is useful in deciphering the genetic arrangement of the
toxin gene and for the cloning of larger DNA fragments of
pertussis toxin.

Relationship of the S4 gene and Tn5-insertions:
Weiss et al., Infect. Immun. 42:33-41, 1983, have
developed several important Tn5-induced B. pertussis
mutants deficient in different virulence factors, i.e.,
pertussis toxin, hemolysin, and filamentous hemagglutinin
(Infect. Immun. 43:263-269, 1984; J. Bacteriol. 153:304-309,
1983). To investigate the physical relationship
between the Tn5 DNA insertion and the pertussis toxin
subunit gene, genomic DNA from these mutants and strain
3779 was analyzed by Southern blots using various
restriction fragments of the cloned 4.5 kb EcoRI/BamHI DNA
fragment as probes. In one set of experiments, blots of
genomic PstI fragments were separately probed with cloned
PstI fragments A, B, C, and D (Fig. 2c). The PstI
fragments from the mutants and strain 3779 which
hybridized with the cloned PstI fragments A, B, and D were
exactly the same size; the blot probed with PstI fragment
B is shown in Figure 3b. However, when the PstI fragment
C was used as a probe, the genomic DNA from mutant strains
BP356 and BP357 showed a clear difference in the size of
the PstI fragments that hybridized as compared to strain
3779 and the other mutant strains (Fig. 3c, lanes 6 and
7). These results indicate that this fragment contains
the site of the Tn5 insertion. As expected, two labeled
fragments were found, since the Tn5 DNA insert has two
symmetrical PstI sites. Other Southern blots (not shown)

in which genomic BglII and SmaI fragments were hybridized
with the 4.5 kb EcoRI/BamHI cloned probe, and the data
from Figure 3c, clearly show that the Tn5 DNA was inserted
1.3 kb downstream from the start of the mature pertussis
toxin S4 subunit in the two mutant strains that were
characterized as pertussis toxin negative phenotypes,
i.e., BP356 and BP357 (Fig. 2b). This insertion is beyond
the termination codon for the S4 subunit (11.7 kD).
Examination of these toxin negative mutants by Western
blots using monoclonal antibodies for individual subunits
indicates that the Tn5 DNA is not inserted in the subunit
structural genes for S1 and S2 (unpublished results). The
pertussis toxin negative phenotype of strains BP356 and
BP357 can be explained by either of two nonexclusive
mechanisms. The Tn5 DNA may be inserted into the coding
regions of either S3, S5, or perhaps another gene required
for toxin assembly or transport. Alternatively, the Tn5
insertion could disrupt the expression of essential
downstream cistrons in a polycistronic operon. Similar
Southern blot analyses of genomic BamHI and EcoRI
fragments indicate that none of the other virulence factor
genes represented by the other Tn5-insertion mutants are
located within the 17 kb region defined by the first BamHI
and the second EcoRI sites as shown in Figure 2a.

Nucleotide Sequence 

Having described the identification, isolation,
and construction of recombinant plasmid pPTX42, containing
pertussis toxin genes, the insert DNA from this plasmid,
i.e., the 4.5 kb EcoRI/BamHI fragment shown in Fig. 4a,
was digested with various restriction enzymes and
subcloned by standard procedures (Maniatis et al., supra)
using the cloning vectors M13mp18 and M13mp19 and E.
coli strain JM101 as described by Messing, Methods
Enzymol. 101:20-78, 1983. Both strands of the DNA were
sequenced using either the Maxam and Gilbert base-specific
chemical cleavage method, supra, or the dideoxy chain
termination method of Sanger et al., PNAS, 74:5463-5467,
1977, with the universal 17-base primer, or both. The DNA
sequence and the derived amino acid sequence were analyzed
using MicroGenie™ computer software.

Because of the high C+G content of B. pertussis
DNA, it was necessary to use both of the above mentioned
methods with a combination of 8% and 20%
polyacrylamide-8 M urea gels for sequence analysis. Each
nucleotide has been sequenced in both directions an
average of 4.13 times. The final consensus sequence of the
sense strand is shown in Table 2. It is noted that the
sequence of the S4 subunit gene has been included in this
table for completeness since this sequence lies in the
middle of the structural gene sequence presented in Table
2. The entire sequence contains about 62.2% C+G with about
19.6% A, 33.8% C, 28.4% G and 18.2% T in the sense strand,
wherein A, T, C and G represent the nucleotides adenine,
thymine, cytosine and guanine, respectively.
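
As an illustration of how such base composition figures are
obtained from a sequence, a minimal Python sketch follows.
The function name is hypothetical and the short test string
is invented; only the four percentages in REPORTED come from
the text above. The sketch also checks that the reported C
and G values add up to the stated C+G content.

```python
from collections import Counter


def base_composition(seq):
    """Return percent A, C, G, T and C+G for a DNA string."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    n = sum(counts.values())
    if n == 0:
        raise ValueError("sequence contains no A, C, G or T")
    pct = {b: 100.0 * counts[b] / n for b in "ACGT"}
    pct["C+G"] = pct["C"] + pct["G"]
    return pct


# Percentages quoted in the text for the sense strand.
REPORTED = {"A": 19.6, "C": 33.8, "G": 28.4, "T": 18.2}
REPORTED_CG = REPORTED["C"] + REPORTED["G"]  # about 62.2
```

Applied to the sense strand of Table 2, a function of this
kind should reproduce the percentages given above.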



Assignment of the subunit cistrons.

The DNA sequence shown in Table 2 was translated
in all six reading frames and the reading frames are shown
in Fig. 4b, c. The open reading frame (ORF) corresponding
to the S4 subunit was identified and is shown in Fig. 4d.
The assignment of the other subunits to their respective
ORFs is based on the following lines of evidence: size of
ORFs, high coding probability, deduced amino acid
composition, predicted molecular weights, ratios of acidic
to basic amino acids, amino acid homology to other
bacterial toxins, mapping of Tn5-induced mutations, and
partial amino acid sequence.
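
The six-frame analysis described above can be sketched in
Python as follows. This is an illustration only, not the
MicroGenie software used for the work: the function names
are hypothetical, an open reading frame is taken simply as a
stretch between termination codons (the vertical lines of
Fig. 4b, c), and no start codon is required.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_free_stretches(seq, min_codons=1):
    """List (strand, frame, start, end) for stop-free runs in six frames.

    Coordinates are 0-based, end-exclusive, and measured on the strand
    being read ("+" is `seq`, "-" is its reverse complement).
    """
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    if (i - run_start) // 3 >= min_codons:
                        hits.append((strand, frame, run_start, i))
                    run_start = i + 3
            # Close the run that reaches the end of the sequence.
            end = frame + ((len(s) - frame) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                hits.append((strand, frame, run_start, end))
    return hits
```

Raising `min_codons` to the length of the smallest subunit
would keep only stretches long enough to be candidates, in
the spirit of the size criterion listed above.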

Significant ORFs, long enough to code for any of
the five toxin subunits, were analyzed by the statistical
TESTCODE algorithm designed to differentiate between real
protein coding sequences and fortuitous open reading
frames in accordance with Fickett, Nucleic Acids Res.
10:5303, 1982. The amino acid composition of each ORF
with a high protein coding probability was calculated,
starting from either the predicted amino terminus of the
mature proteins or from the first amino acid for the
mature protein determined by amino acid sequencing of
HPLC-purified subunits. These data were then compared with
the experimentally-determined compositions of the individual
subunits as described by Tamura et al., Biochem. 21:5516,
1982. Based on the similarity of the amino acid
compositions shown in Table 3, all five subunits were
identified and assigned to the ORF regions shown in Fig.
4d. Table 3 shows that the deduced amino acid compositions
of all five assigned subunits are in good agreement with
the experimentally-determined compositions of Tamura et al.,
supra, with two significant exceptions. First, the S1
subunit contains no lysine residues in the deduced amino
acid sequence, whereas 2.2% lysine was experimentally
determined. Second, in subunits S2, S3, S4, and S5 the
proportion of cysteines is substantially underestimated
in the experimentally observed compositions. These
discrepancies, as well as the remaining minor differences
observed for all subunits, including the previously
assigned S4 subunit, can most reasonably be explained by
experimental error during amino acid analysis. Similar
analyses, in which a DNA-deduced amino acid composition
was compared with an experimentally-derived amino acid
composition, show the same minor differences. The absence
of lysine residues in S1 may explain why lysine-specific
chemical modification does not affect the biological and
enzymatic activities of S1. The amino acid composition of
the ORFs (Fig. 4b, c) not assigned to any subunit shows no
similarity to any of the experimentally-determined amino
acid compositions, although some of these ORFs are quite
long and have a high coding potential. It is possible
that these regions code for other proteins, perhaps
involved in the assembly or transport of pertussis toxin.
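
The composition comparison described above can be sketched
as follows. This Python example is a hedged illustration:
the function names are hypothetical, and the toy sequence
and observed values are invented for the demonstration
rather than taken from Table 3. It computes mole percent
composition from a deduced sequence and ranks residues by
their disagreement with a measured composition.

```python
from collections import Counter


def composition_percent(protein):
    """Mole percent of each residue in a one-letter protein string."""
    counts = Counter(protein)
    n = len(protein)
    return {aa: 100.0 * c / n for aa, c in counts.items()}


def largest_differences(deduced, observed, top=3):
    """Rank residues by |deduced - observed| mole percent, largest first."""
    residues = set(deduced) | set(observed)
    diffs = {
        aa: abs(deduced.get(aa, 0.0) - observed.get(aa, 0.0))
        for aa in residues
    }
    return sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Invented toy data (assumption): a four-residue "protein" and a
# made-up measured composition with cysteine under-recovered.
TOY_DEDUCED = composition_percent("AAKC")
TOY_OBSERVED = {"A": 50.0, "K": 27.2, "C": 10.0}
```

In a comparison of this kind, a residue such as cysteine
that is systematically low in the measured data would rise
to the top of the ranking, which is the pattern the text
reports for S2 through S5.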

The experimentally-estimated molecular weight
and isoelectric point of the individual subunits were
compared to the calculated molecular weight and ratio of
acidic to basic amino acids of the putative proteins
encoded by the ORFs shown in Fig. 4. As expected for this
comparison, Table 3 shows that differences in the ratios
reflect corresponding differences in the observed
isoelectric points for each subunit, i.e., the higher the
acidic content, the lower the isoelectric point. The
comparison of the molecular weights also shows good
correspondence to the experimentally-determined values,
with slight differences for the S1 (less than 10%) and the
S5 (about 15%) subunits. These small differences are
within acceptable limits for protein molecular weights
determined by SDS-PAGE.
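
The two quantities compared in this paragraph can be
sketched in Python as below. The sketch rests on
assumptions that the text does not state: acidic residues
are taken as Asp and Glu and basic residues as Lys and Arg
(whether histidine was counted is not given), the function
names are hypothetical, and the test values are invented.

```python
def acidic_to_basic_ratio(protein, acidic="DE", basic="KR"):
    """Ratio of acidic to basic residues in a one-letter protein string."""
    n_acidic = sum(protein.count(a) for a in acidic)
    n_basic = sum(protein.count(b) for b in basic)
    if n_basic == 0:
        return float("inf")
    return n_acidic / n_basic


def percent_difference(calculated, observed):
    """Absolute difference as a percentage of the observed value."""
    return 100.0 * abs(calculated - observed) / observed
```

A higher ratio from the first function corresponds to a
more acidic protein and hence, as the text notes, a lower
expected isoelectric point.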

Table 4

Comparison of Two Homologous Regions in ADP-ribosylating Subunits
of Pertussis, Cholera, and E. coli Heat Labile Toxins

Region 1

Pertussis S1 subunit       (8)  Tyr Arg Tyr Asp Ser Arg Pro Pro (15)
Cholera(a) A subunit       (6)  Tyr Arg Ala Asp Ser Arg Pro Pro (13)
E. coli(a) HLT A subunit   (6)  Tyr Arg Ala Asp Ser Arg Pro Pro (13)

Region 2

Pertussis S1 subunit       (51) Val Ser Thr Ser Ser Ser Arg Arg (58)
Cholera(a) A subunit       (60) Val Ser Thr Ser Ile Ser Leu Arg (67)
E. coli(a) HLT A subunit   (60) Val Ser Thr Ser Leu Ser Leu Arg (67)

The numbers in parentheses refer to the amino acid position in the mature
proteins.

(a) Data from Yamamoto, et al. FEBS Letter 169:241, 1983
HLT - Heat Labile Toxin

The assignment for S1 in the location shown in
Fig. 4d is further supported by a significant homology of
two regions in the S1 amino acid sequence with two related
regions in the A subunits of both cholera and E. coli heat
labile toxins. These homologous regions, shown in Table
4, may be part of functional domains for a catalytic
activity in the subunits for all three toxins.
Furthermore, the assignment for S1, as well as the correct
prediction of the signal peptide cleavage site, is
supported by preliminary amino acid sequence data for the
mature protein (unpublished results).
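
The extent of the homology in Table 4 can be counted
directly. The Python sketch below uses the residues as
listed in Table 4 (reading the sixth Cholera entry of Region
2 as Ile); the dictionary keys and the function name are
illustrative choices, not part of the original analysis.

```python
REGION_1 = {
    "pertussis S1": "Tyr Arg Tyr Asp Ser Arg Pro Pro".split(),
    "cholera A": "Tyr Arg Ala Asp Ser Arg Pro Pro".split(),
    "E. coli HLT A": "Tyr Arg Ala Asp Ser Arg Pro Pro".split(),
}
REGION_2 = {
    "pertussis S1": "Val Ser Thr Ser Ser Ser Arg Arg".split(),
    "cholera A": "Val Ser Thr Ser Ile Ser Leu Arg".split(),
    "E. coli HLT A": "Val Ser Thr Ser Leu Ser Leu Arg".split(),
}


def identical_positions(a, b):
    """Count aligned positions carrying the same residue."""
    if len(a) != len(b):
        raise ValueError("regions must be the same length")
    return sum(x == y for x, y in zip(a, b))
```

On these eight-residue windows, S1 matches each of the
other two toxins at 7 positions in Region 1 and at 6
positions in Region 2.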

Subunits S2 and S3 share 70% amino acid
homology, which makes the correct assignment of these
subunits to their ORFs difficult if it is based only on
the amino acid composition and the molecular weight.
Nevertheless, the gene order could be determined as shown
in Fig. 4d based on the location of a Tn5-induced mutation
responsible for the lack of active pertussis toxin in the
supernatant of the mutant B. pertussis strains. This Tn5
insertion was mapped 1.3 kb downstream of the start site
for the S4 subunit gene, as indicated by the arrow in Fig.
4a. As can be seen in Fig. 4, the Tn5-insertion in those
mutants would be located in the ORF for S3. Although
unable to produce active pertussis toxin, the mutants are
still able to produce the S2 subunit. Thus, the
Tn5-insertion in those mutants is not located in the
structural gene for S2. Therefore, the ORFs for S2 and S3
could be differentiated.



Amino acid sequences.

The amino acid sequence for each subunit was deduced from the
nucleotide sequence and is shown in Table 2. The mature
proteins contain 234 amino acids for S1, 199 amino acids for
S2, 110 amino acids for S4, 100 amino acids for S5 and 199
amino acids for S3, in the order of the gene arrangement from
the 5'-end to the 3'-end. Most likely all subunits contain
signal peptides, as expected for secretory proteins. The length
of the putative signal peptides was estimated after the
analyses of the hydrophobicity plot, the predicted secondary
structure and application of von Heijne's rule for the
prediction of the most probable signal peptide cleavage site.
The cleavage site for each subunit is shown in Table 2 by the
asterisks. The correct prediction of the cleavage sites for S4
and S1 (unpublished) was confirmed by amino terminal sequencing
of the purified mature subunits. The length of the signal
peptides varies from 34 residues for S1, 28 residues for S3,
and 27 residues for S2, to 21 residues for S4, and 20 residues
for S5. All of the signal peptides contain a positively-charged
amino terminal region of variable length, followed by a
sequence of hydrophobic amino acids, usually in α-helical or
partially α-helical, partially β-pleated conformation. A less
hydrophobic carboxy-terminal region follows, usually ending in
β-turn conformation at the signal peptide cleavage site. All
subunits except S5 follow the -1, -3 rule, which positions the
cleavage site after Ala-X-Ala. The amino-terminal charge for
the subunit signal peptides varies between +4 for S1 and +1 for
S4 and S5. All described properties correspond very well to the
general properties for bacterial signal peptides.
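
As a simple illustration of the figures given above, the length
of each unprocessed precursor can be obtained by adding the
signal peptide length to the mature length. The sketch below is
arithmetic only; it assumes that the signal peptide is the sole
segment removed during maturation, and the names used are
hypothetical.

```python
# Illustrative arithmetic only, using the residue counts stated in the text.
# Assumption: precursor = signal peptide + mature protein, with no other
# processing. All names are hypothetical.

GENE_ORDER = ["S1", "S2", "S4", "S5", "S3"]  # 5'-end to 3'-end
MATURE_LENGTH = {"S1": 234, "S2": 199, "S4": 110, "S5": 100, "S3": 199}
SIGNAL_LENGTH = {"S1": 34, "S2": 27, "S4": 21, "S5": 20, "S3": 28}


def precursor_length(subunit):
    """Residues in the unprocessed subunit (signal peptide plus mature chain)."""
    return SIGNAL_LENGTH[subunit] + MATURE_LENGTH[subunit]


if __name__ == "__main__":
    for subunit in GENE_ORDER:
        print(subunit, precursor_length(subunit))
```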

Two different initiation codons are used for the translation of
all subunits in B. pertussis, i.e., the most frequently used
ATG for S1, S2, S3 and S5, and the less frequently used GTG for
S4. The codon usage (Table 5) is unsuitable for efficient
translation of the pertussis toxin gene in E. coli. This is
reflected by the codon choice for frequently used amino acids,
such as alanine, arginine, glycine, histidine, lysine, proline,
serine and valine. Whether pertussis toxin is a strongly or
weakly expressed protein in B. pertussis and whether this
expression is regulated by the presence of a precise relative
amount of the different tRNA isoacceptors, possibly different
from E. coli, remains to be established. This can be evaluated
by in vitro translation using E. coli and B. pertussis cell
free extracts.

Closer examination of the amino acid sequence reveals the
striking absence of lysines in S1. Another interesting feature
is the overall relatively high amount of cysteines as compared
to E. coli proteins. Cysteines do not seem to be involved in
inter-subunit links to construct the quaternary structure of
the toxin, since all subunits can be easily separated by
SDS-PAGE in the absence of reducing agents. Most likely, the
cysteines are involved in intrachain bonds, since reducing
agents significantly change the electrophoretic mobility of all
subunits but S4. Serines, threonines and tyrosines also are
represented more frequently than in average E. coli proteins.
The hydroxyl groups of these residues may be involved in the
quaternary structure through hydrogen bonding.

Analysis of the flanking regions. 

Since all pertussis toxin subunits are closely linked and
probably expressed in a very precise ratio, it is possible that
they are arranged in a polycistronic operon. A polycistronic
arrangement for the subunit cistrons also has been described
for other bacterial toxins bearing similar enzymatic functions,
such as diphtheria, cholera and E. coli heat labile toxins.
Therefore, the flanking regions were analyzed for the presence
of transcriptional signals. In the 5' flanking region, starting
at position 469, the sequence TAAAATA was found, which matches
six of the seven nucleotides found in the ideal TATAATA Pribnow
or -10 box. An identical sequence can be found in several other
bacterial promoters, including the lambda L57 promoter. Given
the fact that most transcripts start at a purine residue about
5-7 nucleotides downstream from the Pribnow box, the
transcriptional start site was tentatively located at the
adenine residue at position 482. This residue is located in the
sequence CAT, often found at transcriptional start sites.
Upstream from the proposed -10 box, the sequence CTGACC starts
at position 442. This sequence matches four of the six
nucleotides found in the ideal E. coli -35 box TTGACA. The
mismatching nucleotides in the proposed pertussis toxin -35 box
are the two end nucleotides, of which the 3' residue is the
less important nucleotide in the E. coli -35 consensus box. A
replacement of the T by a C in the first position of the
consensus sequence can also be found in several E. coli
promoters. The distance between the two proposed promoter boxes
is 21 nucleotides; a distance of the same length has been found
in the galP1 promoter and in several plasmid promoters. The
proposed -35 box is immediately preceded by two overlapping
short inverted repeats with calculated free energies of -15.6
kcal and -8.6 kcal, respectively. Inverted repeats can also be
found at the 5'-end of the cholera toxin promoter. In both
cases, they may be involved in positive regulation of the toxin
promoters. None of the ORFs assigned to the other subunits is
closely preceded by a similar promoter-like structure. However,
a different promoter-like structure was found associated with
the S4 subunit ORF.
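
The match counts and the spacing quoted for the proposed
promoter can be checked with a few lines of code. The sketch
below is illustrative only; it assumes that the stated positions
refer to the first nucleotide of each box and that the boxes are
compared position by position without gaps. The function and
variable names are hypothetical.

```python
# Illustrative check of the promoter figures quoted in the text.
# Assumptions: positions mark the first nucleotide of each box, and
# boxes are compared position by position (no gaps). Names are hypothetical.

def matches(candidate, consensus):
    """Number of positions at which candidate and consensus agree."""
    if len(candidate) != len(consensus):
        raise ValueError("sequences must have equal length")
    return sum(c == k for c, k in zip(candidate, consensus))


MINUS_10 = {"found": "TAAAATA", "consensus": "TATAATA", "start": 469}
MINUS_35 = {"found": "CTGACC", "consensus": "TTGACA", "start": 442}
TRANSCRIPTION_START = 482

# Nucleotides lying strictly between the end of the -35 box and the
# start of the -10 box.
spacer = MINUS_10["start"] - (MINUS_35["start"] + len(MINUS_35["found"]))

# Distance from the last nucleotide of the -10 box to the proposed start.
last_of_minus_10 = MINUS_10["start"] + len(MINUS_10["found"]) - 1
start_offset = TRANSCRIPTION_START - last_of_minus_10

if __name__ == "__main__":
    print("-10 box matches:", matches(MINUS_10["found"], MINUS_10["consensus"]))
    print("-35 box matches:", matches(MINUS_35["found"], MINUS_35["consensus"]))
    print("spacer length:", spacer)
    print("start offset from -10 box:", start_offset)
```

Under these assumptions the script reproduces the six-of-seven
and four-of-six matches, the 21-nucleotide spacing, and a start
site 7 nucleotides downstream of the -10 box.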

The 3'-flanking region has been examined for the presence of
possible transcriptional termination sites. Several inverted
repeats could be found; the most significant is located in the
region extending from position 4031 to 4089 and has a
calculated free energy of -41.4 kcal. None of the inverted
repeats are immediately followed by an oligo(dT) stretch, which
may suggest that they function in a rho-dependent fashion.
Preliminary experiments indicate, however, that neither
inverted repeat functions efficiently in E. coli (results not
shown). Whether they are functional in B. pertussis remains to
be established and can be investigated by small deletion or
site-directed mutagenesis experiments, which are feasible now
that the DNA sequence is known. Another possibility is that the
five different subunits may not be the only proteins encoded in
the polycistronic operon and that cistrons for other peptides,
possibly involved in regulation, assembly or transport, are
cotranscribed. Non-structural proteins involved in the
posttranslational processing of E. coli heat labile toxin have
been proposed. However, no significantly long ORF was found at
the 3'-end of the nucleotide sequence shown in Fig. 4b. If
other proteins are encoded by the same polycistronic operon,
their coding regions must be located further downstream.

Additionally, the 5'-flanking region of each cistron was also
examined for the presence of ribosomal binding sites. Neither
the ribosomal binding sequences for B. pertussis genes, nor the
3'-end sequence of the 16S rRNA are known. Therefore, the
flanking regions could be compared with only the ribosomal
binding sequences of heterologous procaryotic organisms
represented by the Shine-Dalgarno sequence. Preceding the S1
initiation codon, the sequence GGGGAAG was found starting at
position 495. This sequence shares four out of seven
nucleotides with the ideal Shine-Dalgarno sequence AAGGAGG. The
first two mismatching nucleotides in the pertussis toxin gene
would not destabilize the hybridization to the 3'-end of the
E. coli 16S rRNA. This putative ribosomal binding site is close
enough to the initiation codon for S1 to be functional in
E. coli. Another possible Shine-Dalgarno sequence overlaps the
first one and also matches four out of seven nucleotides to the
consensus sequence. The mismatching nucleotides, however, have
a more destabilizing effect than the ones found in the first
sequence. The S2 subunit ORF is not closely preceded by a
ribosomal binding sequence, which may suggest that S2 is
translated through a mechanism not involving the detachment and
reattachment of the ribosome between the coding regions for S1
and S2. The short distance between the S1 and S2 cistrons, and
the absence of a ribosomal binding site are characteristic of
this mechanism. A ribosomal binding site for S4 in the sequence
CAGGGCGGC, starting at position 2066, is possible. The ORF for
S5 is preceded by the sequence AAGGCG, starting at position
2485, which matches five out of six nucleotides in the
consensus sequence AAGGAG. Finally, S3 is preceded by the
sequence GGGAACAC, which is very similar to the proposed
ribosomal binding site for S1, i.e., GGGAAGAC.
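
For illustration, the nucleotide matches quoted for the putative
ribosomal binding sites can be recomputed as follows. This is a
minimal sketch that assumes an ungapped, position-by-position
comparison of sequences of equal length; the function name is
hypothetical and the sequences are those quoted in the text.

```python
# Illustrative check of the Shine-Dalgarno comparisons quoted in the text.
# Assumption: ungapped, position-by-position comparison.
# sd_matches is a hypothetical helper name.

def sd_matches(candidate, reference):
    """Number of identical positions between two equal-length sequences."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length")
    return sum(c == r for c, r in zip(candidate, reference))


COMPARISONS = [
    # (label, sequence found, sequence compared against)
    ("S1 site vs consensus", "GGGGAAG", "AAGGAGG"),
    ("S5 site vs consensus", "AAGGCG", "AAGGAG"),
    ("S3 site vs proposed S1 site", "GGGAACAC", "GGGAAGAC"),
]

if __name__ == "__main__":
    for label, found, reference in COMPARISONS:
        n = sd_matches(found, reference)
        print(f"{label}: {n} of {len(reference)} nucleotides match")
```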

Taken as a whole, the results described herein clearly
establish the complete nucleotide sequence of all structural
cistrons for pertussis toxin. The gene order, as shown in Fig.
4, is S1, S2, S4, S5, and S3. The calculated molecular weights
from the deduced sequence of the mature peptides are 26,024 for
S1; 21,924 for S2; 12,058 for S4; 11,013 for S5 and 21,873 for
S3. Since S4 is present in two copies per toxin molecule, the
total molecular weight for the holotoxin is about 104,950. This
is in agreement with the apparent molecular weight estimated by
non-denaturing PAGE. The most striking feature of the predicted
peptide sequences is the high homology between S2 and S3. The
two peptides share 70% amino acid homology and 75% nucleotide
homology. This suggests that both cistrons were generated
through a duplication of an ancestral cistron followed by
mutations which result in functionally-different peptides. The
differences between S2 and S3 are scattered throughout the
whole sequence and are slightly more frequent in the
amino-terminal half of the peptides. Despite their high
homology, also reflected in the predicted secondary structures
and hydrophilicities, S2 and S3 subunits cannot substitute for
each other in the functionally-active pertussis toxin. The
comparison between the two subunits may be useful in localizing
their functional domains in relation to their primary,
secondary and tertiary structure. On the basis of the
differences, S2 and S3 are divided into two domains, the
amino-terminal and the carboxy-terminal. Each of the subunits
binds to an S4 subunit. This function could be located in the
more conserved carboxy-terminal domains of S2 and S3. The two
resulting dimers are thought to bind to one S5 subunit. This
function could be assigned to the more divergent amino-terminal
domains of S2 and S3. Alternatively, it is possible that the
dimers bind to the S5 subunit through S4 and that the
amino-terminal domains of S2 and S3 are involved in some other
function, possibly the interaction of the binding moiety (S2
through S5) with the enzymatically-active moiety (S1).
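
The holotoxin molecular weight quoted above follows directly
from the individual subunit values and the copy number of each
subunit. The short sketch below merely reproduces that sum; the
names are hypothetical, and the one-copy assumption for S1, S2,
S3 and S5 is taken from the stoichiometry stated in this
specification.

```python
# Illustrative arithmetic only: sums the calculated subunit molecular
# weights given in the text, weighted by copy number per toxin molecule.
# All names are hypothetical.

MATURE_MW = {"S1": 26024, "S2": 21924, "S4": 12058, "S5": 11013, "S3": 21873}
COPIES = {"S1": 1, "S2": 1, "S4": 2, "S5": 1, "S3": 1}  # S4 is present twice


def holotoxin_mw():
    """Total molecular weight of one holotoxin molecule."""
    return sum(MATURE_MW[s] * COPIES[s] for s in MATURE_MW)


if __name__ == "__main__":
    print(f"Holotoxin molecular weight: {holotoxin_mw():,}")
```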

The enzymatically-active S1 subunit was compared to the A
subunits of other bacterial toxins. Two regions with
significant homology to cholera and E. coli heat labile toxins
were found (Table 4). They are tandemly located in analogous
regions of all three toxins. However, the three amino acid
differences found in these regions cannot be explained by
single base pair changes in the DNA. Furthermore, in most cases
the homologous amino acids use quite different codons in
pertussis toxin compared to cholera and E. coli heat labile
toxins. This, together with the fact that no other significant
homology in the primary structure could be found and that the
amino acid sequences of the other subunits are completely
different from the sequence of any other ADP-ribosylating
toxin, strongly suggests that pertussis toxin is not
evolutionarily related to any of the other known bacterial
toxins. The limited homology of the S1 subunit to the A
subunits of cholera and E. coli heat labile toxins could be due
to convergent evolution, since all three toxins contain a very
similar enzymatic activity and use a relatively closely-related
acceptor substrate (Ni protein for pertussis toxin and Ns
protein for cholera and E. coli heat labile toxins). The
NAD-binding site for the two enterotoxins has been identified
at the carboxy-terminal region of their A1 subunit. No
significant homology could be found between the
carboxy-terminus of the enterotoxins, nor any other NAD-binding
enzymes, and the analogous region in the S1 subunit. This
suggests that the NAD-binding function of the ADP-ribosylating
enzymes is dependent more on the secondary or tertiary
structures than on the primary structures. It is proposed that
the two enzymatically-active domains lie in different regions
of the protein, one at the amino-terminal half of the subunit
for the acceptor substrate (Ni) binding and the other at the
carboxy-terminal half of the subunit for the donor substrate
(NAD+) binding.

The presence of a promoter-like structure upstream of the S1
subunit cistron and possible transcriptional termination
signals downstream of the S3 subunit cistron suggests that
pertussis toxin, like many other bacterial toxins, is expressed
through a polycistronic mRNA. The inverted repeats immediately
preceding the proposed promoter may be sites for positive
regulation of expression of the toxin in B. pertussis. Evidence
for a positive regulation came through the discovery of the vir
gene, the product of which is essential for the production of
many virulence factors, including pertussis toxin. Recent
evidence in our laboratory suggests that the proposed inverted
repeats in the 3' flanking region are not very efficient in
transcriptional termination in E. coli (results not shown). The
termination of transcription in B. pertussis may be carried out
by a slightly different mechanism than in E. coli; on the other
hand, the polycistron may contain other, not yet identified,
genes related to expression of functionally-active pertussis
toxin or other virulence factors. We have described a
promoter-like structure preceding subunit S4 and possible
termination signals following the S4 cistron. The S4
promoter-like structure is quite different from the proposed
promoter at the beginning of the S1 subunit. It is part of an
inverted repeat, suggesting an iron regulation of the S4
subunit expression. This is supported by the fact that
chelating agents stimulate the accumulation of active pertussis
toxin in cell supernatants. It is thus possible that pertussis
toxin is expressed efficiently by two dissimilar promoters, one
(promoter 1) located in the 5'-flanking region and the other
(promoter 2) located upstream of S4. The two promoters would be
regulated by different mechanisms. Promoter 1 would be
positively regulated, possibly by the vir gene product, and
promoter 2 would be negatively regulated by the presence of
iron. In optimal expression conditions, such as in the presence
of the vir gene product and in the absence of iron, the S4
subunit cistron would be transcribed twice for every
transcription of the other subunits. This is a mechanism that
would explain the stoichiometry of the pertussis toxin subunits
of 1:1:1:2:1 for S1:S2:S3:S4:S5, respectively, in the
biologically active holotoxin.
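
The proposed two-promoter mechanism can be pictured with a toy
count of cistron copies per round of transcription. The sketch
below is a deliberately simplified model and not an experimental
result. It assumes that a transcript from promoter 1 spans all
five cistrons, that a transcript from promoter 2 stops at the
termination signals following the S4 cistron and therefore
carries S4 only, and that both promoters fire equally often. All
names are hypothetical.

```python
# Toy model only: counts cistron copies per round of transcription.
# Assumptions (not established facts): the promoter 1 transcript spans all
# five cistrons; the promoter 2 transcript terminates after S4; both
# promoters fire once per round. All names are hypothetical.

GENE_ORDER = ["S1", "S2", "S4", "S5", "S3"]

TRANSCRIPTS = {
    "promoter 1": ["S1", "S2", "S4", "S5", "S3"],
    "promoter 2": ["S4"],
}


def cistron_copies(active_promoters):
    """Copies of each cistron transcribed when the given promoters fire once."""
    counts = {gene: 0 for gene in GENE_ORDER}
    for promoter in active_promoters:
        for gene in TRANSCRIPTS[promoter]:
            counts[gene] += 1
    return counts


def ratio(counts, order=("S1", "S2", "S3", "S4", "S5")):
    """Format the counts as a ratio string in the order S1:S2:S3:S4:S5."""
    return ":".join(str(counts[g]) for g in order)


if __name__ == "__main__":
    both = cistron_copies(["promoter 1", "promoter 2"])
    print("Both promoters active:", ratio(both))
    only_one = cistron_copies(["promoter 1"])
    print("Promoter 1 only:", ratio(only_one))
```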

Attempts to express the pertussis toxin gene in E. coli have
been heretofore unsuccessful, although very sensitive
monoclonal and polyclonal antibodies are available. This lack
of expression in E. coli may reside in the fact that
B. pertussis promoters are not efficiently recognized by the
E. coli RNA polymerase. Analysis of the promoter-like
structures of the pertussis toxin gene and their comparison to
strong E. coli promoters show very significant differences, of
which the most striking ones are the unusual distances between
the proposed -35 and -10 boxes in the pertussis toxin
promoters. The distance between those two boxes in strong
E. coli promoters is around 17 nucleotides, whereas the
distances in the two putative pertussis toxin promoters are 21
nucleotides for the S4 subunit promoter. Preliminary results in
our laboratory using expression vectors designed to detect
heterologous expression signals which are able to function in
E. coli further indicate that B. pertussis promoters may not be
recognized by the E. coli expression machinery. In addition,
the codon usage for pertussis toxin is extremely inefficient
for translation in E. coli (Table 5). Preliminary experiments
show that the insertion of a fused lac/trp promoter in the KpnI
site upstream of the pertussis toxin operon probably enhances
transcription but does not produce detectable levels of
pertussis toxin (unpublished results). Efficient expression in
E. coli would require resynthesis of the pertussis toxin
operon, respecting the optimal codon usage for expression in
B. pertussis, since no other B. pertussis gene has heretofore
been sequenced.

The cloned and sequenced pertussis toxin genes are useful for
the development of an efficient and safer vaccine against
whooping cough. By comparison to other toxin genes with similar
biochemical functions and by physical identification of the
active sites either for the ADP-ribosylation in the S1 subunit
or the target cell binding in subunits S2 through S4, it is now
possible to modify those sites by site-directed mutagenesis of
the B. pertussis genome. These modifications could abolish the
pathobiological activities of pertussis toxin without hampering
its immunogenicity and protectivity. Alternatively, knowing the
DNA sequence, mapping of potential protective epitopes is now
made possible. Synthetic oligopeptides comprising those
epitopes will also be useful in the development of a new
generation vaccine.

EXAMPLE 1

The region containing amino acid residues 8 through 15 of the
S1 subunit (called the "homology box") was chosen for
site-directed mutagenesis, which was accomplished by employing
standard methodologies well known in the art. The specific
codon changes and the resultant amino acid alterations are
shown in Table 6.

To effect the mutagenic alterations, oligonucleotides [Beaucage
et al. Tetrahedron Lett. 22, 1859 (1981)] were synthesized that
incorporated a series of single-codon and double-codon
substitution mutations within the homology box; in addition, a
mutation was also designed that allowed for selective deletion
of the homology region. Two previously described S1 expression
vectors were used for construction of plasmids mutated in the
homology box: pPTXS1/6A and pPTXS1/33B [Cieplak et al., Proc.
Natl. Acad. Sci. U.S.A. 85, 4667 (1988)]. S1/6A is an S1 analog
in which the mature amino-terminal aspartyl-aspartate is
replaced with methionylvaline. Both enzymatic activity and mAb
1B7 reactivity are retained in S1/6A, whereas S1/33B has
neither (Cieplak, supra). The expression vector for each S1
substitution mutant was constructed in a three-way ligation
using the appropriate oligonucleotide with Acc I and Bsp MII
cohesive ends, an 1824-bp DNA fragment from pPTXS1/6A (Acc
I-Sst I), and a 3.56-kb DNA fragment from pPTXS1/33B (Bsp
MII-Sst II). The ligation and the relatively short length of
the oligonucleotides required for the substitution were
facilitated by the presence of novel Bsp MII and Nla IV
restriction sites generated in the original construction of
pPTXS1/33B. Deletion of the homology box involved ligation of a
mung bean nuclease-blunted Acc I site to the left of the box in
pPTXS1/6A, and an Nla IV site to the right of the box in
S1/33B; this ligation resulted in the excision of codons for
Tyr 8 through Pro 15. Vector construction and retention of the
altered sites were confirmed by standard restriction analysis
and partial DNA sequence analysis.

The expression vector constructions were transformed into
E. coli and the mutant S1 genes were expressed after
temperature induction. In this expression system [Burnette et
al. Bio/Technology 6, 699 (1988)], the recombinant S1
polypeptides are synthesized at high phenotypic levels (7 to
22% of total cell protein) and aggregated into intracellular
inclusions. Inclusion bodies were recovered after cell lysis
(Burnette, supra) and examined by SDS-polyacrylamide gel
electrophoresis (PAGE) [U. K. Laemmli, Nature 227, 680 (1970)]
(Fig. 6A). The electrophoresis profile revealed that the
mutagenized S1 products constituted the predominant protein
species in each preparation and that their mobilities were very
similar to that of the parent S1/6A subunit.

To examine the phenotypic effects of the mutations on
antigenicity, the mutant S1 polypeptides were assayed for their
ability to react with the protective mAb 1B7 in an immunoblot
format. The parent construction 6A (Table 6) and each of the
single-codon substitution mutants (5-1, 4-1, 3-1, 2-2 and 1-1)
retained reactivity with mAb 1B7 (Fig. 6B). In contrast, the
reactivity of those mutants containing double-residue
substitutions (8-1, 7-2, and 6-1), as well as the mutant in
which the homology box had been deleted (6A-1), was
significantly diminished or abolished.

The mutant S1 molecules were assayed for ADP-ribosyltransferase
activity by measuring the transfer of radiolabeled ADP-ribose
from [adenylate-32P]NAD to purified bovine transducin [Watkins
et al. J. Biol. Chem. 259, 1378 (1984); Manning et al. ibid.,
p. 749], a guanine nucleotide-binding regulatory protein found
in the rod outer segment membranes [Stryer et al. Annu. Rev.
Cell Biol. 2, 391 (1986)]. As shown in Table 6, each of the
substitutions appeared to reduce specific
ADP-ribosyltransferase activity, with the exception of mutants
5-1 and 2-2, which retained the full activity associated with
the parent 6A species; 6A has approximately 60% of the
ADP-ribosyltransferase activity of authentic S1 (Cieplak,
supra). Neither mutant 4-1 nor any of the double-substitution
mutants exhibited any significant transferase activity when
compared to the inclusion body protein control (denoted 20A);
this control is a polypeptide of Mr = 21,678, derived from a
major alternative open reading frame (orf) in the S1 gene and
does not contain S1 subunit-related sequences.
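
For illustration, the activities reported in Table 6 can be
expressed relative to the parent 6A species and compared with
the 20A negative control. The sketch below uses the mean and
standard deviation values from Table 6; the cutoff of the
control mean plus two standard deviations is an arbitrary
assumption made for this example and is not a criterion taken
from the experiments. All names are hypothetical.

```python
# Illustrative analysis of the Table 6 values (mean cpm, standard deviation).
# Assumption: "background level" is taken as <= control mean + 2 SD; this
# cutoff is arbitrary and chosen only for this example.

TABLE6 = {
    "6A": (23450, 950),
    "5-1": (26361, 1321),
    "4-1": (754, 7),
    "3-1": (13549, 1596),
    "2-2": (22319, 2096),
    "1-1": (7393, 1367),
    "8-1": (926, 205),
    "7-2": (753, 30),
    "6-1": (764, 120),
    "20A": (839, 68),
}


def relative_activity(mutant, parent="6A"):
    """Mean activity of `mutant` as a fraction of the parent mean."""
    return TABLE6[mutant][0] / TABLE6[parent][0]


def at_background(mutant, control="20A", n_sd=2):
    """True if the mutant mean does not exceed control mean + n_sd * SD."""
    mean, sd = TABLE6[control]
    return TABLE6[mutant][0] <= mean + n_sd * sd


if __name__ == "__main__":
    for name in TABLE6:
        if name in ("6A", "20A"):
            continue
        flag = "background" if at_background(name) else "above background"
        print(f"{name}: {relative_activity(name):.1%} of 6A ({flag})")
```

Under this assumed cutoff, mutant 4-1 and the three
double-substitution mutants fall at the level of the negative
control, while 5-1 and 2-2 remain within about 5% or above the
parent value.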

The most noteworthy S1 analog produced was 4-1 (Arg 9 → Lys).
It alone among the single-substitution mutants exhibited little
or no transferase activity under the conditions used (Table 6);
however, unlike the double mutants, it retained reactivity with
neutralizing mAb 1B7.

The results presented herein clearly demonstrate the importance
and magnitude of the critical effect exerted by substitution of
Arg 9 on the enzymatic mechanisms of the S1 subunit. It is
noteworthy that when the Arg 9 → Lys mutation was introduced
into full-length recombinant S1, it was found that transferase
activity was reduced by a factor of approximately 1000. This
result establishes that the substitution at residue 9 is alone
sufficient to attain the striking loss in enzyme activity and
that the coincidental replacement of the two amino-terminal
aspartate residues in the mature S1 sequence with the Met-Val
dipeptide that occurs in S1/6A is not required to achieve this
reduction.

In summary, a mutant gene directing the synthesis of a mutant
PTX polypeptide containing the protective epitope, but with
substantially reduced enzyme activity, has been produced. A
safe vaccine against pertussis, in accordance with the present
invention, is produced by a composition comprising an
immunogenic amount of the mutant PTX polypeptide in a
pharmaceutically acceptable carrier. The term "substantially
reduced" enzyme activity as used herein means more than about
1000 fold less enzymatic activity or almost negligible enzyme
activity compared to the normal (wild type) activity.

It is understood that the examples and embodiments described
herein are for illustrative purposes only and that various
modifications or changes in light thereof will be suggested to
persons skilled in the art and are to be included within the
spirit and purview of this application and the scope of the
appended claims.



Table 6. ADP-ribosyltransferase activity of recombinant S1
mutant polypeptides. Intracellular inclusions containing the
recombinant subunits produced in E. coli were recovered by
differential centrifugation and extracted with 8M urea (18).
The urea extracts were adjusted to a total protein
concentration of 0.6 mg/ml, dialyzed against 50 mM tris-HCl (pH
8.0), and then centrifuged at 14,000g for 30 min. The amount of
recombinant product in the supernatant fractions was determined
by quantitative densitometric scanning of proteins separated by
SDS-PAGE and stained with Coomassie blue.
ADP-ribosyltransferase activity was determined (17) with the
use of 4.0 µg of purified bovine transducin and 100 mg of each
S1 analog. The values represent the transfer of
[32P]ADP-ribose to the α subunit of transducin, as measured by
total trichloroacetic acid-precipitable radioactivity, and each
is given as the mean of triplicate determinations with standard
deviation. The 20A product represents a negative control
because its synthesis results in the formation of intracellular
inclusions that lack S1-related proteins.



Mutant         Amino acid change    Codon change    ADP-ribosyltransferase
designation                                         activity (cpm)

6A             None                 None            23,450 ± 950
5-1            Tyr 8 → Phe          TAC → TTC       26,361 ± 1,321
4-1            Arg 9 → Lys          CGC → AAG       754 ± 7
3-1            Asp 11 → Glu         GAC → GAA       13,549 ± 1,596
2-2            Ser 12 → Gly         TCC → GGC       22,319 ± 2,096
1-1            Arg 13 → Lys         CGC → AAG       7,393 ± 1,367
8-1            Tyr 8 → Leu          TAC → TTG       926 ± 205
               Arg 9 → Glu          CGC → GAA
7-2            Arg 9 → Asn          CGC → AAC       753 ± 30
               Ser 12 → Gly         TCC → GGC
6-1            Asp 11 → Pro         GAC → CCG       764 ± 120
               Pro 14 → Asp         CCG → GAC
20A            Alternate S1 orf                     839 ± 68